Given a particular embodiment, we propose a novel method (C3PO) that learns policies able to achieve any arbitrary position and pose. Such a policy would allow for easier control, and would be re-useable as a key building block for downstream tasks. The method is two-fold: First, we introduce a novel exploration algorithm that optimizes for uniform coverage, is able to discover a set of achievable states, and investigates its abilities in attaining both high coverage, and hard-to-discover states; Second, we leverage this set of achievable states as training data for a universal goal-achievement policy, a goal-based SAC variant. We demonstrate the trained policy's performance in achieving a large number of novel states. Finally, we showcase the influence of massive unsupervised training of a goal-achievement policy with state-of-the-art pose-based control of the Hopper, Walker, Halfcheetah, Humanoid and Ant embodiments.
translated by 谷歌翻译
我们研究可以从有限,凸面和良好的控制空间中生成任意自然语言文本(例如所有英语句子)的模型。我们称它们为通用Vec2Text模型。这样的模型将允许在矢量空间中做出语义决策(例如,通过强化学习),而自然语言产生由VEC2Text模型处理。我们提出了四个所需的特性:这种VEC2Text模型应具有的普遍性,多样性,流利性和语义结构,我们提供了定量和定性的方法来评估它们。我们通过将瓶颈添加到250m参数变压器模型中来实现VEC2Text模型,并通过从大型Web语料库中提取的400m句子(10B代币)对其进行自动编码目标进行训练。我们提出了一种基于往返翻译的简单数据增强技术,并在广泛的实验中表明,所得的VEC2Text模型令人惊讶地导致矢量空间,从而满足我们的四个所需属性,并且该模型强烈超过了标准和denoso自动编码器的表现。
translated by 谷歌翻译
常见的策略梯度方法依赖于代理函数序列的最大化。近年来,已经提出了许多这样的代理功能,大多数没有强烈的理论担保,导致TRPO,PPO或MPO等算法。我们而不是设计另一个代理函数,而是根据功能镜中的函数提出一般框架(FMA-PG),这导致了整个代理功能。我们构建了使策略改进保证能够担保的代理功能,这是由最现有的代理职能共享的属性。至关重要,无论政策参数化的选择如何,这些保证都会持有。此外,FMA-PG的特定实例化恢复了重要的实施启发式(例如,使用前向VS反向KL发散),导致TRPO的变体具有额外的理想性质。通过对简单强盗问题的实验,我们评估FMA-PG实例化的算法。拟议的框架还提出了一种改进的PPO变体,其鲁棒性和效率我们在Mujoco套件上证明。
translated by 谷歌翻译
The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train more than 12 000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, increased disentanglement does not seem to lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.
translated by 谷歌翻译
Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison. Commonly used evaluation methods, such as the Fréchet Inception Distance (FID), correlate well with the perceived quality of samples and are sensitive to mode dropping. However, these metrics are unable to distinguish between different failure cases since they only yield one-dimensional scores. We propose a novel definition of precision and recall for distributions which disentangles the divergence into two separate dimensions. The proposed notion is intuitive, retains desirable properties, and naturally leads to an efficient algorithm that can be used to evaluate generative models. We relate this notion to total variation as well as to recent evaluation metrics such as Inception Score and FID. To demonstrate the practical utility of the proposed approach we perform an empirical study on several variants of Generative Adversarial Networks and Variational Autoencoders. In an extensive set of experiments we show that the proposed metric is able to disentangle the quality of generated samples from the coverage of the target distribution.
translated by 谷歌翻译
This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Universit\'e de Montr\'eal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).
translated by 谷歌翻译
Wearable sensors for measuring head kinematics can be noisy due to imperfect interfaces with the body. Mouthguards are used to measure head kinematics during impacts in traumatic brain injury (TBI) studies, but deviations from reference kinematics can still occur due to potential looseness. In this study, deep learning is used to compensate for the imperfect interface and improve measurement accuracy. A set of one-dimensional convolutional neural network (1D-CNN) models was developed to denoise mouthguard kinematics measurements along three spatial axes of linear acceleration and angular velocity. The denoised kinematics had significantly reduced errors compared to reference kinematics, and reduced errors in brain injury criteria and tissue strain and strain rate calculated via finite element modeling. The 1D-CNN models were also tested on an on-field dataset of college football impacts and a post-mortem human subject dataset, with similar denoising effects observed. The models can be used to improve detection of head impacts and TBI risk evaluation, and potentially extended to other sensors measuring kinematics.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
Scene understanding is a major challenge of today's computer vision. Center to this task is image segmentation, since scenes are often provided as a set of pictures. Nowadays, many such datasets also provide 3D geometry information given as a 3D point cloud acquired by a laser scanner or a depth camera. To exploit this geometric information, many current approaches rely on both a 2D loss and 3D loss, requiring not only 2D per pixel labels but also 3D per point labels. However obtaining a 3D groundtruth is challenging, time-consuming and error-prone. In this paper, we show that image segmentation can benefit from 3D geometric information without requiring any 3D groundtruth, by training the geometric feature extraction with a 2D segmentation loss in an end-to-end fashion. Our method starts by extracting a map of 3D features directly from the point cloud by using a lightweight and simple 3D encoder neural network. The 3D feature map is then used as an additional input to a classical image segmentation network. During training, the 3D features extraction is optimized for the segmentation task by back-propagation through the entire pipeline. Our method exhibits state-of-the-art performance with much lighter input dataset requirements, since no 3D groundtruth is required.
translated by 谷歌翻译
An eco-system of agents each having their own policy with some, but limited, generalizability has proven to be a reliable approach to increase generalization across procedurally generated environments. In such an approach, new agents are regularly added to the eco-system when encountering a new environment that is outside of the scope of the eco-system. The speed of adaptation and general effectiveness of the eco-system approach highly depends on the initialization of new agents. In this paper we propose different techniques for such initialization and study their impact. We then rework the ecosystem setup to use forked agents which brings better results than the initial eco-system approach with a drastically reduced number of training cycles.
translated by 谷歌翻译